Training RNNs as Fast as CNNs
Authors
Abstract
Common recurrent neural network architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations. In this work, we propose the Simple Recurrent Unit (SRU) architecture, a recurrent unit that simplifies the computation and exposes more parallelism. In SRU, the majority of computation for each step is independent of the recurrence and can be easily parallelized. SRU is as fast as a convolutional layer and 5-10x faster than an optimized LSTM implementation. We study SRUs on a wide range of applications, including classification, question answering, language modeling, translation and speech recognition. Our experiments demonstrate the effectiveness of SRU and the tradeoff it enables between speed and performance. We open source our implementation in PyTorch and CNTK.
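For concreteness, here is a minimal PyTorch sketch of a single SRU layer following the light-recurrence and highway equations described in the paper. The class name, the fused projection, and the plain Python time loop are illustrative simplifications, not the authors' released CUDA-optimized implementation.

```python
import torch
import torch.nn as nn


class SRULayer(nn.Module):
    """Minimal sketch of one SRU layer (simplified, not the released implementation).

    All matrix multiplications depend only on the inputs and are computed for the
    whole sequence at once; the recurrence itself reduces to element-wise
    operations, which is what exposes the extra parallelism.
    """

    def __init__(self, size):
        super().__init__()
        # Fused projection producing the candidate x~_t and the pre-activations
        # of the forget gate f_t and reset gate r_t for every time step.
        self.proj = nn.Linear(size, 3 * size)
        self.size = size

    def forward(self, x, c0=None):
        # x: (seq_len, batch, size)
        seq_len, batch, _ = x.shape
        c = x.new_zeros(batch, self.size) if c0 is None else c0

        # Recurrence-independent part: one batched matmul over all time steps.
        cand, f_pre, r_pre = self.proj(x).chunk(3, dim=-1)
        f = torch.sigmoid(f_pre)
        r = torch.sigmoid(r_pre)

        # Recurrence-dependent part: cheap element-wise updates per step.
        outputs = []
        for t in range(seq_len):
            c = f[t] * c + (1.0 - f[t]) * cand[t]
            h = r[t] * torch.tanh(c) + (1.0 - r[t]) * x[t]   # highway connection
            outputs.append(h)
        return torch.stack(outputs), c
```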
Similar resources
Human Pose Estimation with CNNs and LSTMs
Human pose estimation from images and videos has been a very important research field in computer vision. In this thesis, we present an end-to-end approach to the human pose estimation task based on a deep hybrid architecture that combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs). CNNs are used to map the input image to a feature space of fixed dimensionality, and then...
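A hypothetical sketch of this CNN-to-RNN pattern is shown below: a torchvision ResNet-18 backbone produces per-frame features and an LSTM aggregates them before regressing joint coordinates. The backbone choice, layer sizes, and regression head are assumptions for illustration, not the thesis model.

```python
import torch
import torch.nn as nn
from torchvision import models


class CNNLSTMPose(nn.Module):
    """Hypothetical CNN+LSTM pose model; backbone, sizes and head are assumptions."""

    def __init__(self, num_joints=16, hidden_size=512):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()              # keep the 512-d pooled frame features
        self.cnn = backbone
        self.rnn = nn.LSTM(512, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_joints * 2)

    def forward(self, frames):
        # frames: (batch, time, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)   # per-frame CNN features
        hidden, _ = self.rnn(feats)                              # temporal aggregation
        return self.head(hidden).view(b, t, -1, 2)               # (batch, time, joints, x/y)
```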
A Study of All-Convolutional Encoders for Connectionist Temporal Classification
Connectionist temporal classification (CTC) is a popular sequence prediction approach for automatic speech recognition that is typically used with models based on recurrent neural networks (RNNs). We explore whether deep convolutional neural networks (CNNs) can be used effectively instead of RNNs as the “encoder” in CTC. CNNs lack an explicit representation of the entire sequence, but have the ...
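A minimal sketch of this setup, assuming PyTorch's built-in nn.CTCLoss and an arbitrary two-layer 1-D convolutional encoder; the architecture, feature dimensions, and label alphabet size are placeholders rather than the paper's model.

```python
import torch
import torch.nn as nn


class ConvCTCEncoder(nn.Module):
    """Hypothetical all-convolutional CTC encoder (layer sizes are placeholders)."""

    def __init__(self, num_features=40, num_classes=29, channels=256):
        super().__init__()
        # 1-D convolutions over the time axis replace the usual (bi)LSTM encoder.
        self.encoder = nn.Sequential(
            nn.Conv1d(num_features, channels, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        self.classifier = nn.Conv1d(channels, num_classes, kernel_size=1)

    def forward(self, feats):
        # feats: (batch, num_features, time) -> log-probs of shape (time, batch, classes)
        logits = self.classifier(self.encoder(feats))
        return logits.permute(2, 0, 1).log_softmax(dim=-1)


# Training step with PyTorch's built-in CTC loss (blank index 0, dummy data).
model = ConvCTCEncoder()
feats = torch.randn(8, 40, 200)                 # 8 utterances, 40-dim features, 200 frames
targets = torch.randint(1, 29, (8, 30))         # dummy label sequences of length 30
log_probs = model(feats)
loss = nn.CTCLoss(blank=0)(
    log_probs, targets,
    input_lengths=torch.full((8,), 200, dtype=torch.long),
    target_lengths=torch.full((8,), 30, dtype=torch.long),
)
```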
Monitoring tool usage in cataract surgery videos using boosted convolutional and recurrent neural networks
With an estimated 19 million operations performed annually, cataract surgery is the most common surgical procedure. This paper investigates the automatic monitoring of tool usage during cataract surgery, with potential applications in report generation, surgical training and real-time decision support. In this study, tool usage is monitored in videos recorded through the surgical microscope. ...
DNPU: An 8.1TOPS/W Reconfigurable CNN-RNN Processor for General-Purpose Deep Neural Networks
Recently, deep learning with convolutional neural networks (CNNs) and recurrent neural networks (RNNs) has become ubiquitous across applications. CNNs are used to support vision recognition and processing, and RNNs are able to recognize time-varying entities and to support generative models. Combining CNNs and RNNs also makes it possible to recognize time-varying visual entities, such as action and ...
Sified Stochastic Gradient Descent
Prior work has demonstrated that exploiting sparsity can dramatically improve the energy efficiency and reduce the memory footprint of Convolutional Neural Networks (CNNs). However, these sparsity-centric optimization techniques might be less effective for Long Short-Term Memory (LSTM) based Recurrent Neural Networks (RNNs), especially for the training phase, because of the significant stru...
Journal: CoRR
Volume: abs/1709.02755
Publication date: 2017